On Entropy-Based Data Mining
Authors
Abstract
In the real world, we are confronted not only with complex and high-dimensional data sets, but usually with noisy, incomplete and uncertain data, where the application of traditional methods of knowledge discovery and data mining always entails the danger of modeling artifacts. Originally, information entropy was introduced by Shannon (1949) as a measure of uncertainty in the data. Since then, many different types of entropy methods have emerged, with a large number of different purposes and possible application areas. In this paper, we briefly discuss the applicability of entropy methods for use in knowledge discovery and data mining, with particular emphasis on biomedical data. We present a very short overview of the state of the art, with a focus on four methods: Approximate Entropy (ApEn), Sample Entropy (SampEn), Fuzzy Entropy (FuzzyEn), and Topological Entropy (FiniteTopEn). Finally, we discuss some open problems and future research challenges.
Similar resources
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Classification Through Machine Learning Technique: C4.5 Algorithm based on Various Entropies
Data mining is an interdisciplinary field of computer science and refers to extracting or mining knowledge from large amounts of data. Classification is one of the data mining techniques that maps the data into predefined classes and groups. It is used to predict group membership for data instances. There are many areas that adopt data mining techniques such as medical, marketing, tele...
Entropy Based Fuzzy Rule Weighting for Hierarchical Intrusion Detection
Predicting different behaviors in computer networks is the subject of much data mining research. Providing a balanced Intrusion Detection System (IDS) that directly addresses the trade-off between the ability to detect new attack types and providing a low false detection rate is a fundamental challenge. Many of the proposed methods perform well in one of the two aspects, and concentrate on a su...
A Novel Feature Selection Method based on an Integrated Data Envelopment Analysis and Entropy Model
Data mining is one of the growing sciences in the world and can play a competitive-advantage role in many firms. Data mining algorithms can be divided by function into four categories: classification, feature selection, association rules, and clustering.
Mining Time Series Data Based upon Cloud Model
In recent years many attempts have been made to index, cluster, classify and mine prediction rules from increasingly massive sources of spatial time-series data. In this paper, a novel approach to mining time-series data is proposed based on the cloud model, which is described by numerical characteristics. Firstly, the cloud model theory is introduced into time-series data mining. Time-series data c...